The Pros and Cons of Using Machine Learning and Interpretable Machine Learning Methods in psychiatry detection applications, specifically depression disorder: A Brief Review

Simchi, Hossein, Tajik, Samira

arXiv.org Artificial Intelligence

The COVID-19 pandemic has forced many people to limit their social activities, which has resulted in a rise in mental illnesses, particularly depression. To diagnose these illnesses accurately and quickly, and to prevent severe outcomes such as suicide, the use of machine learning has become increasingly important. Additionally, to provide precise and understandable diagnoses for better treatment, AI scientists and researchers must develop interpretable AI-based solutions. This article provides an overview of relevant articles in the field of machine learning and interpretable AI, which helps to understand the advantages and disadvantages of using AI in psychiatric disorder detection applications.


A Brief Review of Hypernetworks in Deep Learning

Chauhan, Vinod Kumar, Zhou, Jiandong, Lu, Ping, Molaei, Soheila, Clifton, David A.

arXiv.org Artificial Intelligence

Hypernetworks, or hypernets for short, are neural networks that generate the weights of another neural network, known as the target network. They have emerged as a powerful deep learning technique that allows for greater flexibility, adaptability, dynamism, faster training, information sharing, and model compression. Hypernets have shown promising results in a variety of deep learning problems, including continual learning, causal inference, transfer learning, weight pruning, uncertainty quantification, zero-shot learning, natural language processing, and reinforcement learning. Despite their success across different problem settings, no review is currently available to inform researchers about these developments and to help them utilize hypernets. To fill this gap, we review the progress in hypernets. We present an illustrative example of training deep neural networks using hypernets and propose categorizing hypernets based on five design criteria: inputs, outputs, variability of inputs and outputs, and the architecture of hypernets. We also review applications of hypernets across different deep learning problem settings, followed by a discussion of general scenarios where hypernets can be effectively employed. Finally, we discuss the challenges and future directions that remain under-explored in the field of hypernets. We believe that hypernetworks have the potential to revolutionize deep learning: they offer a new way to design and train neural networks and can improve the performance of deep learning models on a variety of tasks. Through this review, we aim to inspire further advancements in deep learning through hypernetworks.
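As a toy illustration of the weight-generation idea described above, the following NumPy sketch has a small MLP hypernetwork emit the flattened weights and bias of a target linear layer from a task embedding. All dimensions and the two-layer hypernet architecture are arbitrary choices for illustration, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration.
emb_dim, hidden, in_dim, out_dim = 4, 16, 8, 3

# Hypernetwork: a small MLP that maps a task embedding to the
# flattened parameters of the target linear layer (in_dim*out_dim + out_dim).
W1 = rng.normal(0, 0.1, (emb_dim, hidden))
W2 = rng.normal(0, 0.1, (hidden, in_dim * out_dim + out_dim))

def hypernet(z):
    h = np.tanh(z @ W1)          # hypernet hidden layer
    return h @ W2                # flattened target-network weights

def target_forward(x, z):
    theta = hypernet(z)                          # generate the weights
    W = theta[: in_dim * out_dim].reshape(in_dim, out_dim)
    b = theta[in_dim * out_dim:]
    return x @ W + b                             # target network forward pass

z = rng.normal(size=emb_dim)      # task/layer embedding (hypernet input)
x = rng.normal(size=(5, in_dim))  # a batch of 5 inputs
y = target_forward(x, z)
print(y.shape)  # (5, 3)
```

In practice both networks are trained jointly: the task loss is backpropagated through the generated weights into the hypernet's own parameters.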


Brief Review -- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

#artificialintelligence

ECA-Net clearly outperforms SENet, and also outperforms the fixed-kernel version of ECA-Net. ECA-Net is superior to SENet and CBAM, and is very competitive with AA-Net at lower model complexity. Note that AA-Net is trained with Inception data augmentation and a different learning-rate setting. ECA-Net performs favorably against state-of-the-art CNNs while having much lower model complexity. Across different frameworks, ECA-Net also generalizes well to the object detection task.
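For context, the channel-attention mechanism being compared can be sketched as follows. This minimal NumPy version uses a fixed averaging kernel purely for illustration; the real ECA module learns the 1D convolution weights and chooses the kernel size adaptively from the channel count:

```python
import numpy as np

def eca(x, k=3):
    """Efficient Channel Attention (sketch): x has shape (C, H, W)."""
    C = x.shape[0]
    # Squeeze: global average pooling per channel -> (C,)
    y = x.mean(axis=(1, 2))
    # 1D convolution of kernel size k across the channel dimension
    # (no dimensionality reduction, unlike SENet's bottleneck MLP).
    pad = k // 2
    yp = np.pad(y, pad, mode="edge")
    w = np.ones(k) / k  # fixed averaging kernel here; learned in ECA-Net
    attn = np.array([yp[i:i + k] @ w for i in range(C)])
    attn = 1.0 / (1.0 + np.exp(-attn))   # sigmoid gate in (0, 1)
    # Excite: rescale each channel map by its attention weight.
    return x * attn[:, None, None]

x = np.random.rand(8, 4, 4)
out = eca(x)
print(out.shape)  # (8, 4, 4)
```

The key design point is that each channel's weight depends only on its k nearest neighbour channels, which is what keeps the parameter count so low compared with SENet's fully connected bottleneck.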


Brief Review -- Chinchilla: Training Compute-Optimal Large Language Models

#artificialintelligence

On all subsets, Chinchilla outperforms Gopher. On this benchmark, Chinchilla significantly outperforms Gopher despite being much smaller, reaching an average accuracy of 67.6%, a 7.6% improvement over Gopher: it performs better on 51 of 57 individual tasks, the same on 2, and worse on only 4. On RACE-h and RACE-m, Chinchilla considerably improves performance over Gopher. On LAMBADA, Chinchilla outperforms both Gopher and MT-NLG 530B.


Brief Review -- Codex: Evaluating Large Language Models Trained on Code

#artificialintelligence

The training dataset was collected in May 2020 from 54 million public software repositories hosted on GitHub, containing 179 GB of unique Python files under 1 MB. The authors filtered out files that were likely auto-generated, had an average line length greater than 100, had a maximum line length greater than 1000, or contained a small percentage of alphanumeric characters. After filtering, the final dataset totaled 159 GB.
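The filtering rules described above can be sketched as a simple predicate. The alphanumeric-fraction threshold is not specified in the summary, so the value below is an assumption, and the auto-generation check is omitted:

```python
def keep_file(text, max_avg_line=100, max_line=1000, min_alnum_frac=0.25):
    """Codex-style file filter (sketch): drop files with very long lines
    or few alphanumeric characters. The alphanumeric threshold is an
    illustrative guess; the auto-generation heuristic is not shown."""
    lines = text.splitlines() or [""]
    avg_len = sum(len(line) for line in lines) / len(lines)
    max_len = max(len(line) for line in lines)
    alnum_frac = sum(c.isalnum() for c in text) / max(len(text), 1)
    return (avg_len <= max_avg_line
            and max_len <= max_line
            and alnum_frac >= min_alnum_frac)

print(keep_file("x = 1\ny = x + 2\n"))   # True: short lines, mostly code
print(keep_file("#" * 2000))             # False: one line over 1000 chars
```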


Brief Review -- Scaling Language Models: Methods, Analysis & Insights from Training Gopher

#artificialintelligence

RMSNorm (Zhang and Sennrich, 2019) is used instead of LayerNorm, and the relative positional encoding scheme from Dai et al. (2019) is used rather than absolute positional encodings. Relative encodings permit evaluation on longer sequences than those trained on, which improves the modelling of articles and books.
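For reference, RMSNorm differs from LayerNorm in that it rescales activations by their root mean square, with no mean subtraction and no bias term. A minimal NumPy sketch:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-8):
    """RMSNorm (Zhang and Sennrich, 2019): divide by the root mean square
    of the last axis, then apply a learned per-feature gain. Unlike
    LayerNorm, there is no centering step and no bias."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

x = np.array([[1.0, 2.0, 3.0, 4.0]])
g = np.ones(4)           # gain initialized to 1
y = rms_norm(x, g)
print(np.round(y, 3))
```

After normalization the mean square of each row is 1, which stabilizes activation scale without the extra statistics LayerNorm computes.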


Brief Review -- LiT: Zero-Shot Transfer with Locked-image text Tuning

#artificialintelligence

The proposed model significantly outperforms previous state-of-the-art methods at ImageNet zero-shot classification, with 8.3% and 8.1% improvements over CLIP and ALIGN, respectively. With a pre-trained image model, the proposed setup converges significantly faster than the standard from-scratch setups reported in the literature, so LiT provides a way to reuse models already pre-trained in the literature. It is evident that locking the image tower almost always works best and that using a pre-trained image tower significantly helps across the board, whereas using a pre-trained text tower only marginally improves performance and locking the text tower does not work well.
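A minimal sketch of the contrastive setup behind locked-image tuning: the image-tower embeddings are treated as fixed constants, while in training only the text tower would be updated by backpropagation. The batch size, embedding dimension, and temperature below are illustrative choices, not LiT's actual hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
B, D = 4, 8  # batch of 4 image-text pairs, embedding dim 8 (illustrative)

# "Locked" image tower: pre-trained embeddings, treated as constants.
img_emb = rng.normal(size=(B, D))
# Text-tower output: in LiT this is the only trainable side.
txt_emb = rng.normal(size=(B, D))

def l2norm(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def contrastive_loss(img, txt, temperature=0.07):
    # Cosine-similarity logits between every image and every text.
    logits = l2norm(img) @ l2norm(txt).T / temperature
    # Matching pairs lie on the diagonal; symmetric cross-entropy.
    idx = np.arange(len(img))
    log_p_it = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    log_p_ti = logits.T - np.log(np.exp(logits.T).sum(1, keepdims=True))
    return -(log_p_it[idx, idx].mean() + log_p_ti[idx, idx].mean()) / 2

loss = contrastive_loss(img_emb, txt_emb)
print(float(loss) > 0)  # True: cross-entropy is positive
```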


Brief Review -- An Efficient Solution for Breast Tumor Segmentation and Classification in…

#artificialintelligence

Each BUS image is fed into the trained generative network to obtain the boundary of the tumor, and 13 statistical features are then computed from that boundary: fractal dimension, lacunarity, convex hull, convexity, circularity, area, perimeter, centroid, minor and major axis length, smoothness, Hu moments (6), and central moments (order 3 and below). The Exhaustive Feature Selection (EFS) algorithm is used to select the best set of features; it indicates that fractal dimension, lacunarity, convex hull, and centroid are the 4 optimal features. The selected features are fed into a Random Forest classifier, which is trained to discriminate between benign and malignant tumors.
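Exhaustive feature selection simply scores every feature subset and keeps the best one. A self-contained toy sketch on synthetic data, where the nearest-centroid scorer and the data are stand-ins, not the paper's actual classifier or boundary features:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Hypothetical toy data: 40 tumors x 6 candidate features; only
# features 0 and 3 are informative in this synthetic example.
y = rng.integers(0, 2, 40)                  # 0 = benign, 1 = malignant
X = rng.normal(size=(40, 6))
X[:, 0] += 2.0 * y
X[:, 3] -= 2.0 * y

def accuracy(cols):
    """Leave-one-out nearest-centroid accuracy on a feature subset
    (a cheap stand-in for the classifier used to score subsets)."""
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i       # hold out sample i
        c0 = X[mask & (y == 0)][:, cols].mean(0)
        c1 = X[mask & (y == 1)][:, cols].mean(0)
        d0 = np.linalg.norm(X[i, cols] - c0)
        d1 = np.linalg.norm(X[i, cols] - c1)
        correct += (d1 < d0) == y[i]        # predict the closer centroid
    return correct / len(y)

# Exhaustive search: score every non-empty feature subset, keep the best.
best = max((cols for k in range(1, 7) for cols in combinations(range(6), k)),
           key=lambda cols: accuracy(list(cols)))
print(best)
```

The cost is exponential in the number of features, which is why EFS is only practical for small feature sets like the 13 boundary features here.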


Brief Review -- Unsupervised Machine Translation Using Monolingual Corpora Only

#artificialintelligence

Using a GAN-style adversarial idea, an NMT model can be trained without any parallel data, which I think is similar to CycleGAN in the image domain.